# Audio feature extraction
Voc2vec Hubert Ls Pt
Apache-2.0
voc2vec is a foundation model designed specifically for non-verbal human audio, built on the HuBERT framework and pre-trained on 125 hours of non-verbal audio data.
Audio Classification
Transformers English

alkiskoudounas
114
1
Voc2vec As Pt
Apache-2.0
voc2vec is a foundation model designed specifically for non-linguistic human audio, built on the wav2vec 2.0 framework.
Audio Classification
Transformers English

alkiskoudounas
31
0
Wav2vec2 Large Robust 24 Ft Age Gender
This model takes raw audio signals as input and outputs age predictions and gender probabilities (child/female/male), along with the pooled state of the last transformer layer.
Audio Classification
Transformers

audeering
44.13k
33
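
The description above mentions two kinds of output: class probabilities and a pooled state of the last transformer layer. The sketch below illustrates what those terms usually mean, under stated assumptions: pooling is taken to be a mean over time frames, probabilities are taken to come from a softmax over three logits, and the label order follows the child/female/male order given in the description. The function names are hypothetical and nothing here calls the model itself; the real model may pool or normalize differently, and its age output is not covered.

```python
import math

# Label order assumed from the card text (child/female/male); illustrative only.
GENDER_LABELS = ("child", "female", "male")


def mean_pool(hidden_states):
    """Average a [time][feature] list of frame vectors into one feature vector."""
    n_frames = len(hidden_states)
    n_features = len(hidden_states[0])
    return [
        sum(frame[j] for frame in hidden_states) / n_frames
        for j in range(n_features)
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(hidden_states, gender_logits):
    """Return the pooled embedding and a label-to-probability mapping."""
    pooled = mean_pool(hidden_states)
    probs = dict(zip(GENDER_LABELS, softmax(gender_logits)))
    return pooled, probs


# Toy input: two time frames with two features each, and three equal logits.
pooled, probs = summarize([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0, 0.0])
print(pooled)
print(probs)
```

A pooled vector of this kind is what downstream code typically uses as a fixed-length audio feature, regardless of how long the input clip is.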
Ast Finetuned Audioset 10 10 0.4593 Finetuned Gtzan
BSD-3-Clause
An audio classification model based on the Audio Spectrogram Transformer (AST) architecture, fine-tuned on the GTZAN music genre classification dataset, with a reported accuracy of 92%.
Audio Classification
Transformers

Bhanu9Prakash
50
0
Speech Accent Classification
Apache-2.0
A foundational speech recognition model based on the Wav2Vec2 architecture, trained on 960 hours of English speech data, suitable for speech classification tasks.
Audio Classification
Transformers English

dima806
40
4
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53
Apache-2.0
An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the AI Light Dance dataset.
Speech Recognition
Transformers

gary109
26
1
Ai Light Dance Chord Ft Wav2vec2 Large Xlsr 53
Apache-2.0
An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the GARY109/AI_Light_Dance - ONSET-CHORD2 dataset.
Speech Recognition
Transformers

gary109
46
0